On December 5th, ByteDance's Doubao large model team released FullStack Bench, a new code evaluation benchmark covering more than 11 real-world scenarios, supporting 16 programming languages, and comprising 3,374 questions. Compared with previous evaluation standards, this benchmark assesses the code development capabilities of large models more accurately and across a broader programming domain, which helps optimize models for real-world programming tasks. Current mainstream code evaluation benchmarks, such as HumanEval and MBPP, typically focus on basic and advanced programming problems.
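
To make the contrast concrete, here is a minimal, hypothetical sketch of how execution-based benchmarks in the HumanEval/MBPP style typically score a model: generated code is executed against a problem's unit tests and marked pass or fail. The data schema and helper names (`passes_tests`, `check`, `entry_point`) are illustrative assumptions, not the actual FullStack Bench harness or format.

```python
# Illustrative sketch only: a minimal HumanEval-style correctness check.
# The problem dict, field names, and test convention below are hypothetical,
# not the actual FullStack Bench data schema.

def passes_tests(candidate_code: str, test_code: str, entry_point: str) -> bool:
    """Run model-generated code against a problem's unit tests in a fresh namespace."""
    namespace: dict = {}
    try:
        exec(candidate_code, namespace)          # define the candidate solution
        exec(test_code, namespace)               # define the test function
        namespace["check"](namespace[entry_point])  # hypothetical test convention
        return True
    except Exception:
        return False  # any assertion or runtime error counts as a failure

# A hypothetical problem in the spirit of the "basic" tasks such benchmarks cover.
problem = {
    "prompt": "def add(a, b):\n",
    "entry_point": "add",
    "test": "def check(f):\n    assert f(2, 3) == 5\n    assert f(-1, 1) == 0\n",
}

generated = problem["prompt"] + "    return a + b\n"
print(passes_tests(generated, problem["test"], problem["entry_point"]))  # True
```

A broader benchmark like FullStack Bench keeps this same execute-and-verify principle but extends the problem pool to many languages and application scenarios, which is where narrower suites fall short.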